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In a recent paper we presented a linear scaling Kohn-Sham density functional theory (DFT) code 
based on Daubechies wavelets, where a minimal set of localized support functions is optimized in 
situ and therefore adapted to the chemical properties of the molecular system. Thanks to the 
systematically controllable accuracy of the underlying basis set, this approach is able to provide 
an optimal contracted basis for a given system: accuracies for ground state energies and atomic 
forces are of the same quality as an uncontracted, cubic scaling approach. This basis set offers, by 
construction, a natural subset where the density matrix of the system can be projected. In this 
paper we demonstrate the flexibility of this minimal basis formalism in providing a basis set that 
can be reused as-is , i.e. without reoptimization, for charge-constrained DFT calculations within a 
fragment approach. Support functions, represented in the underlying wavelet grid, of the template 
fragments are roto-translated with high numerical precision to the required positions and used as 
projectors for the charge weight function. We demonstrate the interest of this approach to express 
highly precise and efficient calculations for preparing diabatic states and for the computational setup 
of systems in complex environments. 


I. INTRODUCTION 

Density functional theory (dftP^i 

is arguably the 

most popular approach to electronic structure calcula¬ 
tions for a wide range of systems. However, it suffers 
from various well-known limitations, like for example the 
self-interaction problenP® which can result in electron 
delocalization errors, and the fact that it is in principle 
a ground state theory only. For these reasons, the DFT 
formalism has been extended in the form of constrained 
DFT (C DFT) 5 1 to include an additional constraint on 
the density, so that the lowest energy state satisfying a 
given condition can instead be found. When a reasonable 
guess for such a condition is at hand, it can therefore be 
used both to find a particular excited state of the system 
and to localize the electronic density in such a way as 
to prevent spurious delocalization, and thus provides a 
way of overcoming the above problems. Of course, time- 
dependent DFT (TDDFT) 6 can be used to find multiple 
excited states, however when one is interested in a partic¬ 
ular excited state CDFT can be advantageous, especially 
given that the additional costs associated with adding a 
constraint are relatively low. Furthermore, TDDFT also 
suffers from self interaction problems, and can give inac¬ 
curate results for certain types of excited states, including 
charge transfer excitations. 

Constrained DFT has been implemented in a num¬ 
ber of codes, using both localized basis sets 7 ! and plane 
waved^, and has been successfully applied in a vari¬ 
ety of contexts, including charge constrained molecular 
dynamic^, the calculation of the correct energy align¬ 
ment of metal/molecule interface^ 0 and the calculation 
of electronic coupling matrix elements 1112 . For a general 
overview of CDFT see Ref. [13] 

As with all DFT calculations, the choice of basis set 
has a large impact on both the accuracy and computa¬ 


tional cost of CDFT. One way of accessing large systems 
is to reformulate the standard cubic scaling approach to 
DFT in terms of localized orbitals, or ‘support functions’, 
which we will discuss further in the following section. We 
wish to perform CDFT calculations on large systems us¬ 
ing such an approach, whilst maintaining the high accu¬ 
racy associated with systematic basis sets. As such, we 
require a basis set which is at the same time localized 
and systematic. For this reason, we have chosen to use a 
Daubechies wavelet basis setP^l, as it is a systematic basis 
set exhibiting the desired properties of compact support 
in both real and Fourier space and can be chosen to be 
orthogonal. Wavelet basis sets have an inherent flexibil¬ 
ity, in that they allow for multiresolution grids, which 
is particularly useful for inhomogenous systems. Com¬ 
bined with the ability to explicitly treat charged systems 
in open boundary conditions, wavelets provide an ideal 
basis set for accurate CDFT calculations of large systems. 

In this paper, we show that the combination of a sup¬ 
port function approach with a wavelet basis set allows 
for the definition of a flexible fragment based approach to 
CDFT, which can further reduce the computational cost, 
particularly for very large systems. In this approach, 
a set of support functions are optimized for an isolated 
(small) molecule, or ‘fragment’, and reused as a fixed ba¬ 
sis in a larger system containing many of these molecules 
e.g. a solvent. It is then straightforward to associate the 
constrained charge with a given fragment using a Lowdin 
like definition of the CDFT weight function. However, in 
the larger system each molecule may well have a different 
orientation and so the support functions, which are de¬ 
scribed in terms of the fixed wavelet grid, cannot simply 
be duplicated for each molecule. 

Therefore, we have developed a scheme to reformat the 
support functions for arbitrary roto-translations using in¬ 
terpolating scaling functions. This interpolation, thanks 
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FIG. 1. Least asymmetric Daubechies wavelet family of order 
2m = 16; both the scaling function <p(x) and wavelet ^(x) 
differ from zero only within the interval [1 — ra, m \. 

to the properties of the underlying basis set, results in 
only a negligible loss of accuracy and so the support func¬ 
tions can be directly reused, reducing the computational 
cost by an order of magnitude compared to optimizing 
the support functions from scratch for the full system. 

In the following sections we will first summarize our 
approach to large scale DFT calculations using localized 
support functions represented in a wavelet basis set! 15 , as 
implemented in the BigDFT electronic structure code 16 . 
We will then outline our implementation of CDFT, fol¬ 
lowing which we will explain our fragment approach, in¬ 
cluding a description of the reformatting scheme, vali¬ 
dating our method with calculations on prototypical sys¬ 
tems. Finally, we will present an application of CDFT for 
the fullerene Ceo in two different environments, through 
which we will demonstrate the flexibility and potential of 
a fragment based approach. 

II. METHODOLOGY 
A. Linear scaling DFT with wavelets 

We and others have recently presented a newly de¬ 
veloped method for DFT calculations on large systems, 
which combines the use of a minimal localized basis 
of ‘support functions’ with the use of an underlying 
wavelet basis setP^. This method has been implemented 
in BigDFT, which uses the orthogonal least asymmet¬ 
ric Daubechies family of order 16, which are depicted in 
Fig. 0 The Kohn-Sham (KS) orbitals are expressed in 
terms of the support functions via a set of coefficients cf : 

I*i> = £c?l0«>, (!) 

a 

where the support functions are represented directly in 
the wavelet basis set localized on a 3 dimensional grid, 
so that they can be thought of as adaptively contracted 
wavelets. Rather than working directly with the KS or¬ 
bitals, we instead work in terms of the density matrix, 
p(r, r'), which is itself defined in terms of the support 
functions and the density kernel, K a 

P(r,r') = 'Yl<t>oc{r)K al 3 <t>f } {r'). (2) 

a,/3 


The density matrix has been shown to decay exponen¬ 
tially with distance for systems with a gap thanks to 
the so-called nearsightedness principle-^^, an d thus a 
formulation in terms of the density matrix allows us to 
take advantage of this to achieve linear scaling with the 
number of atoms in the system, thereby avoiding the cu¬ 
bic scaling of standard approaches to DFT. From this 
the charge density is calculated directly from the sup¬ 
port functions and density kernel. Similarly, the band 
structure energy and the charge of the system can be 
calculated from the density kernel via: 

E bs = Tr [KH] , N = Tr [KS] (3) 

where H indicates the Hamiltonian matrix in the basis 
of the support functions, and S is the support function 
overlap matrix. 

The support function formalism allows one to map the 
degrees of freedom of KS orbitals into a localized descrip¬ 
tion, that can be directly put in relation with atomic po¬ 
sitions. In practice, the support functions are truncated 
within spherical localization regions with a user-defined 
radius, and some additional truncation must be applied 
to the density kernel which is then exploited via sparse 
matrix algebra to achieve a fully linear scaling algorithm. 
Therefore, to some extent, a given support function (j) a 
can be associated to the atom a where its localization 
region is centered. In order to achieve accurate results, 
both the support functions and density kernel are opti¬ 
mized during the calculation, that is the energy is min¬ 
imized with respect to both quantities. Providing the 
localization regions are sufficiently large, this results in a 
minimal localized basis set with an accuracy equivalent 
to the underlying basis set. 

The general scheme is common to other basis optimiza¬ 
tion for density-matrix minimization based linear scal¬ 
ing DFT codes, e.g. ONETEI® and ConquesfP^, with 
the addition of a few novel features. These include the 
application of a confining potential to the KS Hamilto¬ 
nian, which ensures the support functions remain local¬ 
ized. In order to apply the confining potential consis¬ 
tently, we also enforce an approximate orthogonality con¬ 
straint on the support functions, in contrast with other 
appr oache s which use fully non-orthogonal support func¬ 
tions 23 24 . Furthermore, the properties of the description 
in the wavelet basis are such that the algorithm guar¬ 
antees that the Pulay contribution to the atomic forces 
can safely be neglected, so the forces can be calculated 
accurately and cheaply. 

The method can be divided into two key components: 
the optimization of the support functions (either with 
or without a confining potential), and the optimization 
of the density kernel. This latter point can be achieved 
via a choice of schemes, as detailed previousl}^. These 
include a direct minimization approach, where the coeffi¬ 
cients cf are first updated using DIIS or steepest descents 
to minimize the band structure energy and then used to 
construct the density kernel, and the Fermi Operator Ex¬ 
pansion (FOE) method, where the density kernel is ex- 
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pressed as a function of the Hamiltonian matrix which 
can be evaluated numerically using a Chebyshev polyno¬ 
mial expansion. Close attention has also been paid to 
the parallelization of the code, such that massively par¬ 
allel machines can be exploited to perform large scale 
calculations (see also Ref. 2 ^). For charged calculations, 
we have found the use of the direct minimization method 
to be the most suitable, due to a reduction in the oc¬ 
currence of charge sloshing during convergence, and the 
flexibility afforded by working with the wavefunction co¬ 
efficients rather than directly with the density kernel as 
in the FOE method. 


B. Atomic charge analysis 

The mapping between electronic and localized degrees 
of freedom which is provided by the support function for¬ 
malism allows one to perform an accurate atomic charge 
analysis, meaning that each atom is assigned a partial 
net charge, such that the electrostatic properties of the 
system are conserved. Obviously this conservation is 
only possible within a certain limit, as one is mapping 
a continuous quantity (the electronic charge) to a dis¬ 
crete quantity (the atomic point charges). If, however, 
the error introduced by this mapping onto point charges 
is small enough, the system under investigation can be 
reasonably approximated by a simple setup of charged 
point particles, which paves the way for future applica¬ 
tions such as coupling different levels of accuracy within 
the same calculation. 

Given the overlap matrix S and the density kernel K, 
the partial charge located on atom a can be defined by 
the so-called Lowdin charge: 

q a =±(s^KS^) aiai , (4) 

oc' 

where the sum runs over all support functions a' which 
are located on atom a. Obviously q a = tr(KS) = TV, 
i.e. the total charge (the monopole) is conserved. In order 
to check whether higher multipoles can also be conserved, 
we compared the dipole moment calculated using this ap¬ 
proach with that calculated using the continuous charge 
density. In addition a comparison with the dipole mo¬ 
ment calculated with the cubic scaling version of BigDFT 
was done as a reference. The values for a strongly polar¬ 
ized molecule (H 2 0) and a non-polarized one (C 60 ) are 
given in Tab. [T| 

As a second test of the reliability of our method we 
directly compared the atomic point charges with those 
calculated by performing a Bader charge analysis of the 
charge density calculated using the cubic scaling ap¬ 
proach. As can be seen from Tab. [ilj the differences 
between the exact results are smaller with our approach. 
In particular, for the C 60 fullerene, reasons of symmetry 
impose that the charge should be equally distributed and 
no atom should carry a net charge. As can be seen, the 


Lowdin procedure comes closer to this result than the 
Bader analysis. 

C. Constrained DFT 

The general idea of constrained DFT is to force a 
charge to remain localized in a given region of the simu¬ 
lation space. This is achieved via the addition of a La¬ 
grange multiplier term to the Kohn-Sham energy func¬ 
tional which enforces a given constraint on the resulting 
electronic density, so that rather than being the ground- 
state density of the system, the density instead corre¬ 
sponds to a particular excited state. This Lagrange mul¬ 
tiplier can also be thought of as an additional applied 
potential, otherwise referred to as the constraining po¬ 
tential. The constraint can also take a number of other 
forms, but for the purposes of this work we are inter¬ 
ested only in constraining the charge. The new functional 
therefore becomes: 

W Ip, V c J = E ks [p] + v c (^jw c (r) p (r) dr - N c ) , (5) 

where Eks is the Kohn-Sham energy functional, V c is the 
aforementioned Lagrange multiplier, N c is the required 
charge within the specified region and w c (r) is a weight 
function which defines this region. The weight function 
and N c are defined in advance, however the value of V c 
which correctly enforces the constraint must be found 
during the calculation. Wu and Van Voorhis 26 demon¬ 
strated that the lowest energy state for which the con¬ 
straint is correctly applied is in fact a maximum with 
respect to V c and so it becomes possible to efficiently 
determine the correct V c . It is also straightforward to 
add multiple constraints to the system, and indeed one 
is frequently interested in constraining the charge differ¬ 
ence between two regions. This feature is important for 
the simulation of charge-transfer excitations within the 
CDFT formalism. 

Rewriting the new functional (Eq. [5| in density matrix 
form ad^ 

W [p, V c ] = E ks \p\ + V c (Tr [Kw c ] - N c ), (6) 

the charge constraint is easily added to the existing algo¬ 
rithm in BigDFT. This construction requires the weight 
matrix, which is defined via the weight function as 

w a p = j <p a (r) w c (r) 4>p (r) dr. (7) 

It then remains to define the weight function, for which a 
number of different schemes exist. The support function 
approach used here lends itself to a Lowdin like defini¬ 
tion, which is analogous to that used above to determine 
atomic charges. Using this approach we directly con¬ 
struct the weight matrix via 

U>a0=(S5PS*) , (8) 

V / <xj3 
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cubic 

linear 


exact dipole 

exact dipole 

point charge approx. 

h 2 0 

(0.463, -0.506, -0.186) 

(0.466, -0.510, -0.187) (0.606, -0.668, -0.247) 

norm 

0.711 

0.716 

0.935 

dex ' d 

1 

0.9999996 

0.9999897 

^60 

(-0.0004, -0.0004, -0.0004) 

(-0.025, -0.025, -0.025) (-0.055, -0.055, -0.055) 


TABLE I. Dipole moments calculated using the exact charge density for the cubic and linear scaling approaches, respectively, 
and using the partial atomic point charges. All values are given in atomic units. 


cubic - Bader linear - Lowdin 
H 2 0 (-1.27, 0.61, 0.67) (-0.83, 0.42, 0.41) 
C 60 0.061 ± 0.045 0.003 ± 0.002 


TABLE II. Atomic point charges, calculated by a Bader anal¬ 
ysis of the charge density from the cubic scaling approach, 
and the Lowdin procedure using the density kernel and over¬ 
lap matrix from the linear scaling approach. For H 2 0 we in¬ 
dicate the values on all three atoms, for C 60 we give the mean 
of the absolute values together with the standard deviation. 
All values are given in atomic units. 


where S is the overlap matrix between support functions 
and P is a projection matrix, defined as 1 for all support 
functions belonging to the region where a constraint is 
being applied, and 0 elsewhere. Alternatively, if one is 
constraining a charge difference between two regions, it 
should be set to 1 on one of the regions, —1 on the other, 
and 0 elsewhere. 

The final step, once the weight function has been de¬ 
fined, is to derive a scheme for finding the correct value of 
V c for a given charge constraint value, N c . There are two 
possible approaches to the optimization. In the first ap¬ 
proach, one can find the optimum value of V c at each step 
of the self-consistent density optimization, i.e. the ground 
state density is updated in an outer loop, with V c updated 
in an inner loop. Alternatively, the second approach con¬ 
sists of fully minimizing the functional W [p, V c \ of Eq. [6] 
with respect to the density for a fixed value of V c , up¬ 
dating V c and finding the new minimum density, then 
repeating to convergence, i.e. the maximization with re¬ 
spect to V c is performed in an outer loop with the ground 
state density found self-consistently in an inner loop. We 
chose the latter, as it was observed to be more stable. 
We use Newton’s method to update V c , with the second 
derivative calculated using a finite difference approach. 


D. Fragment approach 

The combination of the novel features described above 
and the use of a wavelet basis set make this approach 
ideal for the application of CDFT to large systems. In 
particular the ability to reuse the support functions can 
result in significant savings for e.g. geometry optimiza¬ 
tions and calculations on charged systems, as previously 



FIG. 2. The fragment approach as illustrated for a cluster of 
water molecules: the support functions are initially optimized 
for an isolated water molecule and then duplicated for a col¬ 
lection of water molecules, avoiding the need for optimization 
in the larger system. 


demonstrated^ 15 !. Furthermore, this idea of support func¬ 
tion reuse can be extended to a fragment based approach, 
which is similar to the so-called fragment orbital method 
which has b een used to calculate electronic coupling ma¬ 
trix element d 12 * 27 * 28 l 

The central idea is to take a group of atoms, or more 
specifically an isolated molecule, and fully optimize the 
support functions. These support functions are then used 
as a fixed basis for a system containing several molecules, 
as illustrated in Fig. [2] for a simple example. We refer to 
the initial molecule for which the support functions were 
optimized as the ‘template’ molecule. 

As the support functions are kept fixed in the frag¬ 
ment approach, w a p need only be calculated once at the 
start of the calculation, after which it remains fixed. Fur¬ 
thermore, due to the quasi-orthogonality of the support 
functions, when the fragment approximation is justified 
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can in general be calculated using a Taylor approxi¬ 
mation, and so the calculation of the weight matrix adds 
very little overhead to the calculation. 

In this work we focus on systems where the respective 
fragments are well defined, and thus the support func¬ 
tions generated from the isolated fragments can be used 
for the full system with a minimal impact on the ac¬ 
curacy. In cases where electrons are being added to a 
fragment, it is important to ensure that the lowest unoc¬ 
cupied molecular orbital (LUMO) and, if necessary, the 
next few states in energy are sufficiently well represented 
by the support function basis. As discussed elsewhere 1 ^, 
this can be achieved using the direct minimization for¬ 
malism to optimize a few additional states during the iso¬ 
lated calculation without adding a charge to the system. 
For systems where the fragments are less well defined the 
implementation could in principle be extended to further 
optimize the support functions for the combined system, 
either in a neutral state or while the charge constraint 
is being enforced. In such cases, the Lowdin approach is 
also expected to be less accurate, and so it would be de¬ 
sirable to use an alternative form for the weight function. 

Finally, it should be mentioned that there are some 
subtleties related to the initial guess for the charge den¬ 
sity. This depends on the initial density kernel, which 
is constructed from the fragment KS orbitals. For neu¬ 
tral calculations it is straightforward to use the fragment 
orbitals and occupancies directly from the isolated calcu¬ 
lations, however for charged calculations some additional 
input is required. One approach would be to occupy the 
fragment orbitals in order of their energies, however this 
can lead to charge distributions which are significantly 
different from the required constraint. This can result 
in slow convergence, or even worse, problems with local 
minima. A better approach should therefore take into 
account the effect of the constraining potential on the 
fragment orbital energies. This can be done by assigning 
occupation numbers so that any excess/deficit in charge 
is localized on the same fragment as the constraint, so 
that the initial density already satisfies the charge con¬ 
straint. Alternatively, the risk of encountering a local 
minimum can be reduced by adding a degree of noise 
to the fragment orbitals, or by completely randomizing 
the initial guess, subject to the correct overall charge. 
However, such an approach is in general much slower to 
converge, and thus the latter strategy is generally not 
recommended. 


E. Reformatting scheme for roto-translations 

As the support functions are defined in terms of an 
underlying grid of wavelets, in order to implement a 
fragment approach it is necessary to have a scheme for 
reformatting the support functions due to a change in 
atomic positions. We are frequently interested in situa¬ 
tions where a molecule has been rotated and translated, 
for example when calculating electronic coupling matrix 


elements in a dimer for varying angles between the two 
monomers. Therefore, we have developed and imple¬ 
mented a scheme for reformatting the support functions, 
given the axis and angle of rotation between some initial 
and final positions for a given fragment mass center. 

A fragment of the system is defined by the user via 
a list of atomic positions, which should of course be in 
bijection with the atom list defining the template frag¬ 
ment. Therefore the first problem is to identify the com¬ 
bination of translation and rotation which sends the tem¬ 
plate fragment to the position of the system’s fragment. 
As a first step, two reference systems are chosen such 
that the fragment center of mass is in the same position. 
This operation is equivalent to finding the translation be¬ 
tween the template and the system. We then have two 
lists of atomic positions, ,<s }, where T and S label 
the template and system fragment respectively, and the 
subscript a, indicating the atom, ranges from 1 to TV, the 
number of atoms in the fragment. 

If the system fragment is a rigid displacement of the 
template (i.e. its internal coordinates are unchanged), 
the rotation matrix 7 Z we seek is such that Rf = 
In general, we should assume there is a 
slight modification of the internal coordinates, as the ge¬ 
ometry of the fragment might be affected by the interac¬ 
tion with the environment. In this case, the matrix 1Z is 
such as to minimize the cost function 

N N 

m) = -j2\\'R s a-'E n °i> R b\\ 2 ■ ( 9 ) 

0-1 b=1 

The determination of the matrix 1Z in such a manner 
constitutes a version of the well-known Kabsch algo¬ 
rithm 29 (also known as Wahba’s problem)®, which can 
be solved by the Singular-Value Decomposition of the 
3-by-3 matrix® Bij = J2a=i(-^a)i(^a)j- After having 
found two matrices U and V and a diagonal matrix S 
such that B = USV 1 , the optimal rotation is 

U = UW V = diag (1,1, det(W) det(V)) . (10) 

The value of J(JZ) defined in ^ might then be used 
to quantify the validity of the rigid transformation ap¬ 
proach. In the case where its value is below a given 
threshold (fixed to 10 -3 in our case), we may proceed 
with the reformatting of the template basis functions, 
which will be denoted b y {(/)&) i n what follows. 

As described in R e f s pSESE2I from the expression of 
\(/>a) in a Daubechies wavelets basis set, the so-called 
“magic-filter” transformation can be used to define a real 
space representation of the basis functions, given in terms 
of one-dimensional interpolating scaling functions (ISF) 

<t>a( x >y> z ) = ^ 2 c ijk t Pi{x)‘Pj(y)iPk{z) , ( 11 ) 

where ipi(x) = (p(x/h — i) is one element of the ISF ba¬ 
sis set, which is constituted of uniform translations of 




6 


the mother function (p(t) over the points of a uniform 
grid of spacing h, covering the entire simulation domain. 
These points are labeled by indices ijk. The points 
r ijk = (hi, hj, hk) therefore lie within the box contain¬ 
ing the support of </>^(r). 

This real-space expression is optimal in the sense that 
it preserves the same moments of the original represen¬ 
tation given in Daubechies wavelets. The interpolating 
property of the ISF basis set is such that \jk)- 

Let us suppose we have a one-dimensional function ex¬ 
pressed in ISF, namely /(x) = fi(Pi(x). We know 
that fi = f(hi). If we want to translate the function / 
by a displacement A, and express this function in the 
ISF basis, we have f(pc + A) = JT //(^(x), with 

= + = ( 12 ) 

3 

where the filter tf = ip(j + A/h) implements the (uni¬ 
form) translation. This filter has a limited extension (the 
same as the function <p(x)) and of course t = Sj t -k- 

Imagine now we have a different ISF basis set {(/?/(x)} 
defined on a uniform grid spacing of separation In and 
a reference frame x/ = h/, which is related to x by a 
more complicated transformation x(x) of the coordinate 
space. If this transformation can be inverted, by x(x), 
then a new function f(x) = /(x(x)) can be defined in this 
frame. For each grid point /, it is then always possible to 
find i in the old frame such as to minimize the absolute 
value of A/ = x(x/) — hi. 

Using the above relations we might approximate 
f(x) ~ YjI fiMZ), where 

fi = />/) = f(hi + A i)=J2 fi-tf 1 ' ( 13 ) 

3 

If the transformation x is a continuous function of x 
which varies slowly enough, this is in general a rather 
good approximation (see Fig. |3|. 

This framework can be easily generalized to a roto- 
translation in three dimensions. Indeed, we would like to 
estimate the function 


= 0a(r(f)) - Y dlJK t Pi(x)tpj(y)ip K (z) , (14) 

I,J,K 

where the coordinates f = (x,y,z) are defined as 

? = 1Z • r (15) 


where 1Z is calculated by Eq. (10). In addition a rigid 


shift vector s = (s x ,s y , s z ) is defined as the difference 
between the coordinates of the center of mass of the two 
fragments. If the rotation is the identity matrix, the 
template reference frame is then f = r + s. As in the 
one dimensional case presented above, the interpolation 
depends on the inverse mapping r(r). We detail in the 
following a procedure to identify such a function. 

The coefficients cijk of (?) can be found in three 
steps. We first start by considering the transformation 


law for x. This transformation can be thought of as a 
function of the template coordinates r: 


x(x, y, z ) = Knx + 1Z 12 y + 1Z 13 z . 


(16) 


In the same spirit as Eq. (13), we may invert Eq.([16j) with 


respect to one template coordinate t = x,y,z into x in 
the system’s reference frame. The choice of the variable 
t depends on the entries of the rotation matrix, and it is 
in general given by the coordinate which is multiplied by 
the coefficient of the highest absolute value in Eq. (16). 


This choice guarantees that the t variable is the one for 
which x — t is slowly varying. Let us imagine t = x for 
this example. We can define the function 

4>a l \x,y,z) = <f%(x(x,y,z) - s x ,y,z) 

= Y ci,3,Wi{x)'Pj{y) i Pk{z) , (17) 

I,j,k 


by proceeding for all j, fc, as described in Eq.(13), to de¬ 
fine the coefficients c/j^. The second step is related to 
the expression of y. Depending on the choice of the vari¬ 
able t in the first step, we have to consider one of these 
three relations: 


hlny = U 21 x + lZ 33 y — lZ 32 z , ( 18 ) 

7 ^ 12 y = 7Z 22 x - 1Z 33 X + 1Z 3 iz , ( 19 ) 

1Z 13 y = 7Z 23 x + lZ 32 x — U 31 y , (20) 

which hold when in the first step t = x,y,z respectively. 
These relations can be derived using the orthogonality of 
the rotation matrix 7Z. This function can now be inverted 
with respect to one of the old variables. Again, this choice 
will depend on the values of the coefficients multiplying 
each variable. 

In our example, we have to consider the relation ( p~8] ) 
as we have chosen t = x in the first step. We choose to 
invert the relation with respect to z, having z = z(x,y,y). 
In this case we will have, as a second step 


= 4> { a\x,y,z(x,y,y) - s z ) 

= Y • (21) 

In the third step, the remaining variable, (which is y for 
the illustrated example), can be directly obtained from 
the inverse relation 

r = n .- 1 ■ f = -R} ■ f , (22) 

which is easier to express as 7 Z is an orthogonal matrix. 
In our case, the final result is therefore 

<t>ln,y,z) = fi { a\x,y(x,y,z) - s y ,z) ■ (23) 

We recall that the definition of depends on the or- 
der of the operation. Here we have chosen to interpolate 
first with respect to x, then z and y. The best choice of 
order depends only on the entries of the matrix 7 Z. 
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1. Accuracy 

In order to assess the accuracy of the reformatting 
scheme, we have applied it to a water molecule under¬ 
going a series of rotations. Support functions were gen¬ 
erated for a template water molecule, using a dense grid 
with a spacing of 0.132 A, and were then reused for water 
molecules in a variety of different orientations using a less 
dense grid with a spacing of 0.185 A. As a point of com¬ 
parison, calculations were also performed for each orien¬ 
tation fully optimizing the support functions with a grid 
spacing of 0.185 A. This allows us to quantify both the 
error introduced by the support function reformatting 
and the errors due to representing the wavefunctions on 
a fixed grid, i.e. the so-called ‘eggbox effect’. The eggbox 
effect of the standard cubic scaling approach is also pre¬ 
sented. The computational setup has been chosen such 
that the difference in ground state energies between the 
cubic and support function approaches is of the order of 
1 meV/atom. 

The results are shown in Fig. [3j where we can see that 
the eggbox effect is of the order of 0.1 meV/atom. As 
both the cubic and linear scaling approaches use the same 
underlying grid, the variation is similar in each case. The 
error due to the interpolation also remains small - less 
than a few meV/atom. Importantly, the overall error for 
the reformatted calculations remains of the same order of 
magnitude as that due to the selected localization radii 
of the support functions. 

III. RESULTS 

Below we present results for three different systems, 
where the first two can be validated against CDFT im¬ 
plementations in other codes. In each case we use the 
local density approximation (LDA) exchange-correlation 
functional 33 and HGH pseudopotentiald^H within isolated 
boundary conditions. For carbon, nitrogen and oxygen 
we use four support functions per atom, for hydrogen we 
use one, and for zinc we use nine. 

A. N 2 

Wu and Van Voorhis have previously studied Nj^and 
so this system provides a useful test case. We used a 
grid spacing of 0.185 A with support function radii of 
7.4 A, i.e. completely filling the simulation cell; here we 
aim to validate only the general correctness of the im¬ 
plementation of CDFT rather than the full fragment ap¬ 
proach. Fixing the bond length at 1.12 A, we have varied 
the charge separation between the two atoms, results for 
which are shown in Fig. [4j For simplicity, the calculations 
were performed only in the spin-averaged sense. Our re¬ 
sults are closer to those obtained by Wu and Van Voorhis 
using a Becke weight population than the Lowdin scheme, 
however given that we have used the LDA whereas they 



e n u 


FIG. 3. Plot showing the energy variation for a water 
molecule rotated through different angles ( 0 ) and axes of ro¬ 
tation (n). Results are shown for the standard cubic scaling 
approach (‘cubic eggbox’), fully optimized support functions 
(‘linear eggbox’) and a fixed support function basis generated 
for a template molecule (‘template’). The cubic reference is 
the energy at the initial orientation calculated using the cu¬ 
bic approach, for the linear and template approaches it is the 
same quantity calculated in the fully optimized support func¬ 
tion basis. There is a roughly constant error of 1 meV/atom 
in the support function basis compared to the cubic scaling 
approach. Selected orientations are shown along the bottom. 



N c 


FIG. 4. Change in energy with respect to an unconstrained 
calculation and applied potential value for differing charge 
separations in N 2 . 


used B3LYP we do not expect exact agreement. Fur¬ 
thermore, we should recall that the fragment approach 
presented here is aimed at systems where the donor and 
acceptor are well separated, whereas the use of support 
functions optimized for an isolated nitrogen atom is nec¬ 
essarily an approximation in this case. Nonetheless, we 
have successfully reproduced the correct trends for both 
the energy and the Lagrange multiplier. 























FIG. 5. The ZnBC-BC model complex. 


B. ZnBC 

As a more elaborate test case we take the 
zincbacteriochlorin-bacteriochlorin (ZnBC-BC) complex, 
which has also b een studied previously in some detail, 
both with CDFT^^and other approaches, e.g. Refs. 36 
and EH This system is ideally suited to our approach as 
the donor and acceptor are clearly separated. Further¬ 
more, TDDFT has been shown to give incorrect energies 
for the ZnBC + -BC _ and ZnBC _ -BC + charge transfer 
(CT) excited stated^ and so the advantages of CDFT 
are clear. It has previously been demonstrated that the 
differences between a (l,4)-phenylene-linked ZnBC-BC 
complex and a model complex where the link is elimi¬ 
nated are smalP^ for simplicity we therefore choose to 
use the latter, where the distance between the two previ¬ 
ously linked carbon atoms is 5.84A, as depicted in Fig. [HJ 
Taking the coordinates from Ref. 37. we relaxed the iso¬ 
lated ZnBC and BC molecules separately, then built the 
model complex without further relaxation. We used a 
grid spacing of 0.185 A and localization radii of 5.82 A. To 
assess the accuracy of the fragment support functions we 
compare the neutral energies for the model complex with 
those obtained using cubic scaling BigDFT. The results 
are shown in Tab. m where we can see that the error 
for both the model complex and the isolated molecules is 
less than 1 meV/atom. 

The energies for the two CT excited states relative 
to the unconstrained DFT ground state are 3.71 eV for 
ZnBC + -BC _ and 3.98 eV for ZnBC _ -BC + , which is 
consistent with previous results^^. We can also gain 
some insight into the nature of these CT states by plot¬ 
ting the difference in the electronic density between the 
neutral and constrained calculations, as in Fig. [6j Not 
only is the charge transfer characteristic clear, the plot 
for ZnBC + -BC - also shows remarkably good agreement 
with previous calculations that used the significantly 
more expensive Bethe-Salpeter approactP^, which con¬ 
firms that CDFT can be used to obtain physically rel¬ 
evant CT excitons, and provide a reliable estimation of 
the corresponding excitation energies. 

We have also plotted the relationship between the con¬ 
straining potential, V c , the total energy relative to the 
unconstrained calculation, A E, and the charge difference 
between the two molecules, N c . This is shown in Fig. [7| 



cubic 

( e 

frag. 

V) 

diff. 

(meV) 

ZnBC 

-4472.626 

-4472.604 

22.6 

BC 

-4471.998 

-4471.980 

17.7 

ZnBC-BC 

-8944.629 

-8944.575 

54.1 

ZnBC“-BC + 

- 

-8940.592 

- 

ZnBC + -BC“ 

- 

-8940.860 

- 


TABLE III. Energies for isolated ZnBC and BC, the neu¬ 
tral model ZnBC-BC complex and the two lowest energy CT 
states, as calculated using standard BigDFT (‘cubic’) and the 
fragment approach (‘frag.’). Where applicable the difference 
between the two approaches is also indicated (‘diff.’). 



(a)ZnBC“-BC+ 



(b)ZnBC + -BC“ 


FIG. 6. Density differences between the neutral and charged 
calculations for the two charge transfer states. Red (blue) in¬ 
dicates an increase (decrease) in the electronic charge density 
with respect to the neutral. 


where N c = 1 corresponds to ZnBC + -BC - and N c = — 1 
corresponds to ZnBC _ -BC + ; our results agree well with 
previous calculation ^ 26 * 35 *, despite the use of a different 
exchange-correlation functional. This test highlights the 
robustness of the method - in order for the correct value 
of V c to be found within a minimal number of iterations 
of the constraint loop, there should be a smooth rela¬ 
tionship between a given V c and the resulting N c . If for 
certain values of V c the convergence is insufficient, such 
that the final charge deviates from the correct value, this 
will negatively impact the search for the correct V c . We 
observed that in general such a smooth curve is straight¬ 
forward to obtain, given a reasonable initial guess for the 
density kernel and therefore charge density. As discussed 
in Section II C[ this can be achieved by defining the ini- 
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-1 -0.8 -0.6 -0.4 -0.2 0 0.2 0.4 0.6 0.8 1 

N c 

FIG. 7. Applied potential value and change in energy com¬ 
pared to an unconstrained calculation for differing charge sep¬ 
arations in the ZnBC-BC model complex. 


proach results in values which match experiment much 
better than traditional semilocal functionals, indeed our 
results for the IP and EA of the isolated molecule agree 
very well with the experimental values of 7.6 eV®^® and 
2.7 eV® respectively. In this case, we wish to apply the 
ASCF approach to a molecule in an environment, which, 
as we will show, is easily achieved using CDFT. 

On the other hand, with unconstrained DFT calcula¬ 
tions the use of the ASCF approach is much more deli¬ 
cate when studying environmental effects with LDA: as 
the charge tends to be overly delocalized, charged cal¬ 
culations do not simply represent a perturbation from 
the isolated values, as discussed in more detail below. In 
other words, the calculated energy differences do not cor¬ 
respond to the IP and EA of C6o in an environment, but 
to a completely different quantity. If one wishes to calcu¬ 
late this quantity it is therefore essential to use CDFT. 


tial occupancies in a manner which is consistent with the 
desired charge difference. 


C. C 60 

In order to accurately calculate material properties it 
is important to account for environmental effects, e.g. by 
including a solvent or neighboring molecules in a molec¬ 
ular material. However, this can considerably increase 
the cost of a simulation, as in the case of large systems 
in solution where the solvent must fill a correspondingly 
large volume. Various strategies have been developed for 
reducing the cost, for example by using implicit solvation 
methods^®], however it is frequently desirable to treat 
explicitly the environmental degrees of freedom. Thanks 
to the fragment approach, the treatment of solvents and 
other surrounding molecules can readily be achieved in 
BigDFT with relatively low cost, as we will demonstrate 
through the example of the fullerene Ceo in two different 
environments: when in an aqueous solution and when 
surrounded by other Ceo molecules. For each system we 
constrain a charge of ±1 to the central Ceo molecule in 
order to determine the environmental impact on the ion¬ 
ization potential (IP) and electron affinity (EA). 

For traditional DFT calculations with semilocal func¬ 
tionals like LDA it is well known that the above quanti¬ 
ties are badly estimated by the frontier orbitals, i.e. the 
HOMO (LUMO) for the IP (EA). Therefore, in order to 
extract physically meaningful information, one must ei¬ 
ther use more expensive beyond-DFT approaches, or in¬ 
stead calculate the IP and EA using the so-called ASCF 
method. This is made possible when explicitly charged 
calculations are available, i.e. when charged and neutral 
calculations have energies that can be measured with 
respect to a common reference. The treatment of the 
electrostatic potential which is included in the BigDFT 
code makes such a comparison possible 42 . This latter ap- 


1. Computational details 


There have been a number of previous studies of Ceo in 
water, both experimental and theoretical^^, however 
they have mainly focused on neutral fullerenes. Previous 
research has indicated the existence of a first hydration 
shell surrounding Ceo containing between 60 and 65 wa¬ 
ter molecules^®!, we have therefore chosen to restrict 
ourselves to systems containing 66 water molecules. We 
present results for th ree example structures, which are 
depicted in Fig. |8(a) They were generated by inserting 
the Ceo into water droplets where the water molecules 
were deposited with random orientations at random po¬ 
sitions subject to the room-temperature density of wa¬ 
ter. The structures were then relaxed until the RMS 
forces were below 10 meV/A. For the environment of 
fullerenes, we limit the cost of the simulations by includ¬ 
ing six nearest neighbor fullerenes only, so that the sys¬ 
tem is arran ged as a three dimensional cross, as depicted 
in Fig. 8(b)| Each of the fullerenes was considered in its 
gas-phase structure. 

The fragment calculations were performed with a grid 
spacing of 0.185 A, while the template calculations were 
performed using a denser grid of 0.132 A to ensure ac¬ 
curate reformatting; we used support function radii of 
4.23 A. These values have been chosen such as to en¬ 
sure the applicability of the Lowdin approach for the 
weight matrix on the central C 60 whilst preserving ab¬ 
solute accuracy of the unconstrained calculations to the 
order of 3meV/atom, see data in Sec. |III C 2 In order 
to ensure the support functions are sufficiently accurate 
for the negatively charged calculations for Ceo, the tem¬ 
plate calculation was performed optimizing three addi¬ 
tional states (to account for degeneracies). 

We continue to use the LDA functional as the differ¬ 
ence between the LDA and other treatments like PBE for 
the IP and EA of fullerenes has previously been shown 
to be negligible, and reasonable agreement with exper¬ 
iment has also been observed^. We also neglected the 
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(a)The three configurations A, B and C (left to right) of C 60 in H2O. 
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(b)C 60 with its six nearest neighbors. 
The blue lines are drawn between the 
molecular centers along the axes. 



FIG. 8. The different environments for Ceo- 


modelling of dispersive terms on the C 60 - C 60 interac¬ 
tions due to their negligible impact on frontier orbital 
eigenvalues and on total energy differences in charged 
calculations. 


2. Testing the fragment approach 


The results for the C 60 structure with a center to cen¬ 
ter distance of 10 A are shown in Tab. IlYl with the cor¬ 
responding values for the isolated molecule. We also in¬ 
clude the cubic scaling results as they allow us to assess 
the accuracy of the fragment approach for this system. 
As anticipated, for the isolated molecule the fragment er¬ 
ror is of the order of 0.2 eV, which is about 3meV/atom. 
In order to confirm that this accuracy is preserved, we 
also compared unconstrained cubic and fragment calcu¬ 
lations for the seven Ceo structure. To find an uncon¬ 
strained solution we built the initial guess from the frag¬ 
ment densities in such a manner that the charge was 
equally distributed among fragments; the final solution 
remained close to this charge distribution. We found val¬ 
ues of -3.681 eV and 6.707 eV, i.e. the difference with the 
cubic results is of the same magnitude as for the isolated 
molecule (see Tab. IV). 

We have also investigated the effect of varying the 
separation between the molecules by repeating the con¬ 
strained fragment and (unconstrained) cubic calculations 
with center to center separations ranging from 10 A to 
20 A, which corresponds to a shortest distance between 
molecules of 3.1 A to 13.1 A. The results are plotted in 
Fig.U As expected, for large separations the results tend 
towards the isolated values. For the unconstrained calcu¬ 
lations there is an abrupt change between two different 
states, whereas with CDFT there is not only a smooth 
trend, but also an exponential relationship with distance, 
proving that the fragment approach is sufficiently precise 
to capture such trends. 



isolated 

in H 2 O 

in C 6 o (10 A) 

Q 

cubic frag. 

ABC 

cdft cubic 

-1 

-2.795 -2.589 

-2.017 -2.728 -2.180 

-2.854 -3.803 

+ 1 

7.648 7.783 

7.262 8.033 7.837 

7.526 6.685 


TABLE IV. Energy differences with neutral, i.e. E Q — E° for 
Ceo when isolated and in the two environments. Two values 
are given for the isolated Ceo: that of the fragment approach, 
which in this case merely refers to a fixed support function 
basis as only one fragment is present, and the cubic scaling 
reference. For the results in water, constrained fragment re¬ 
sults are given. For the nearest neighbor results (‘in Ceo’)? 
results are presented for both the constrained fragment and 
(unconstrained) cubic approaches. The unconstrained results 
exhibit stronger deviation from the isolated values, showing 
that the environment is not correctly modeled as it is not 
acting as a perturbation of the system. Units are in eV. 


Thus far, we have only considered calculations with 
a shifting of the template support functions, however we 
also wish to demonstrate the effectiveness for rotated sup¬ 
port functions. To this end, for a distance of 10 A the 
six outer fullerenes were collectively rotated along the z- 
axis by angles of 15°, 45° and 90°, with the orientation 
of the central molecule remaining unchanged. This was 
found to have a negligible impact on the IP and EA, with 
a difference in the values for the various orientations of 
around 0.01 eV for the constrained fragment calculations 
compared with 0.05 eV for the unconstrained cubic calcu¬ 
lations. Such values are too small to be significant com¬ 
pared to the errors associated with the basis. In order to 
determine whether the energies are truly unaffected by 
the orientation, it would be necessary to account for the 
dispersion effects which are not captured by the LDA. 
However, the fact that no spurious errors are introduced 
by the rotating of the support functions serves to further 
confirm the accuracy of the reformatting scheme. 
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R(A) 

FIG. 9. Variation in electron affinity and ionization potential 
for increasing separations, where the distance is measured be¬ 
tween the centers of neighboring molecules. The energy plot¬ 
ted is relative to the isolated value in the respective basis, i.e. 
AE = (e 9 oL - Bisoj.) - (E? ull - .EfU), where Q indicates the 
charge state, ‘full’ refers to the 7 molecule system and ‘isol.’ 
refers to the isolated molecule. 


3. Comparison of environments 


We now return to the results in Tab. hyi in order to 
compare the effect of the two environments. As expected, 
for both environments the constrained values remain rel¬ 
atively close to the isolated results, certainly much closer 
than the unconstrained results for the fullerene environ¬ 
ment. In a sense, when the constraint is enforced, the 
presence of the surrounding molecules could be thought 
of as a perturbation on the isolated state, although the 
strength of the perturbation is clearly much stronger for 
the water. To further explore this, we have also plotted 
differences in the converged electronic densities between 
the neutral and charged calculations in Fig. [lOj The ef¬ 
fects of the charge constraint are clearly visible, with the 
excess charge distributed across the molecules for the cu¬ 
bic calculation and much more clearly localized for the 
constrained calculations. Furthermore, the charge dif¬ 
ference on the central molecule for the constrained cal¬ 
culations clearly retains the same character as the re¬ 
spective isolated density, with the excess or deficit of 
charge also resulting in an induced dipole on the neigh¬ 
boring molecules. As expected, the impact of the water 
is stronger than the neighboring fullerenes, where the 
closer proximity and stronger dipole moment of the wa¬ 
ter molecules results in a stronger deviation from the iso¬ 
lated density difference. Similar behavior has also been 
observed for the water structures which are not depicted. 

As can be seen from the variation in IP and EA for 


structures A, B and C (Tab. IV), the effect of the water is 
not only stronger than that of the neighboring fullerenes, 
but the dependence on the structure is also quite sig¬ 
nificant. Furthermore, we have also observed that the 
resulting energies are strongly affected by the choice of 


weight function, which is not the case for the fullerene 
environment. There is some freedom in the procedure 
for optimizing the template support functions, e.g. the 
localization radii and the number of additional states in¬ 
cluded; we tested a few of the different options. For the 
fullerene environment the variation in calculated IP and 
EA due to the choice of template parameters were small 
and systematic, whereas for the aqueous environment the 
variation was much stronger. Indeed, the fragment ap¬ 
proach provides an ideal setup to explore the impact of 
different weight functions - such a strong dependence 
would be harder to detect when considering only two or 
three different choices. In future the choice of weight 
function could also be decoupled from the fragment basis 
to allow for a more thorough exploration of its influence 
in constraining the charge. 

Of course, in order to correctly assess the impact of 
the environment, one should go beyond the model struc¬ 
tures used here, both in terms of the size and procedure 
used to generate them. Furthermore, in the case of the 
water, proper sampling should be performed over a num¬ 
ber of different configurations, which should be generated 
at the c orrect temperature e.g. using molecular dynam- 
i c j 50 | 52 | Qr M on t; e Carlo 4 ^^ simulations. However, aside 
from the generation of input structures and any even¬ 
tual relaxations of the atomic coordinates, the fragment 
calculations are quick and easy to perform, requiring lit¬ 
tle additional setup aside from the template calculations. 
Furthermore, we have demonstrated both the accuracy 
and flexibility of the fragment approach for such systems. 
As such, given appropriate atomic coordinates this work 
could easily be extended in future to a large number of 
configurations for both environments, or indeed applied 
to other fullerenes or solvents. 


IV. CONCLUSION 

We have presented a method for constrained DFT 
calculations on large systems, using a fragment based 
scheme. This has been implemented in the BigDFT elec¬ 
tronic structure code within a recently developed frame¬ 
work, which uses a basis of localized support functions 
represented in an underlying wavelet grid to achieve lin¬ 
ear scaling behavior with respect to system size while 
retaining the systematic accuracy of the underlying grid. 
The division of a given system into fragments (ideally 
distinct molecules), each with its own associated sup¬ 
port functions leads to a natural approach to CDFT, 
where the charge is constrained to a given fragment via 
a Lowdin like definition of the weight function. This 
Lowdin approach can also be used to straightforwardly 
calculate atomic charges, as we have demonstrated. 

Furthermore, by using a reformatting scheme which 
enables the reuse of support functions for identical frag¬ 
ments, irrespective of their position or orientation in the 
system, we are able to further reduce the cost of simula¬ 
tions by an order of magnitude, as the support functions 
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(e)isolated fragment, Q = +1 (f)cubic, Q = +1 


(g)constrained fragment, 

Q = +1 


(h)constrained fragment, 

Q = +1 


FIG. 10. Density differences between the neutral and charged calculations for the fullerene when isolated, when surrounded by 
other fullerenes (with a center to center distance of 10 A) and when in water (structure A). The densities are plotted on the 
central plane with a logscale, with red (blue) indicating an increase (decrease) in the electronic charge density with respect to 
the neutral. 


can be separately optimized for each ‘template’ fragment 
and used as a fixed basis in the system of interest. The 
properties of the wavelet basis set ease the implementa¬ 
tion of reformatting a numerical field given in real space, 
so that we were able to implement this scheme in a man¬ 
ner which is both efficient and accurate. The flexibility 
of this method, together with the ability of the BigDFT 
code to treat systems with many atoms (see e.g. RefP), 
makes it ideally suited for both neutral and charged cal¬ 
culations on very large systems at modest computational 
cost. 

We have presented results from two previously stud¬ 
ied systems in order to validate our approach, as well as 
an example application. For this latter point, we have 
performed calculations on Ceo in two different environ¬ 
ments, namely a model nearest neighbor system contain¬ 
ing seven fullerenes and in an aqueous solution. The ef¬ 
fects of the constraint are clearly visible in the electronic 
densities, which we have compared to the unconstrained 
and isolated results. We have also shown that the pres¬ 
ence of water has both a stronger impact on the results 
and a stronger dependence on the choice of weight func¬ 
tion than the presence of neighboring fullerenes. 

The reformatting approach described here has another 
key benefit aside from reducing the cost of such calcu¬ 


lations: as the basis set for each fragment will remain 
equivalent following the reformatting, the computational 
setup provided by our approach is ideal where Hamilto¬ 
nian matrix elements of the whole system have to be 
considered. For example, for electronic coupling ma¬ 
trix elements (‘transfer integrals’) between two identical 
monomers, the basis set for each monomer will remain 
equivalent following the reformatting, so that there is no 
ambiguity in the sign of the coupling matrix elements. 
In contrast, for support functions optimized from scratch 
there is no guarantee that the phase for both the support 
functions and wavefunction coefficients will be identical 
between the two molecules and thus the sign of the trans¬ 
fer integral cannot be determined. Indeed, we will be 
publishing results in the near future for such an appli¬ 
cation based on the framework presented in the current 
worlP^. 

In the future we also hope to extend this work to per¬ 
mit calculations on realistic nanoscale devices, using a 
multi-scale approach. In the first instance this would 
involve changing the definition of a fragment to an in¬ 
dividual atom, which naturally leads to a DFT based 
tight-binding like method. This formalism also allows 
the correct definition of an ‘embedded’ approach, where 
different regions of a simulation cell are treated at differ- 
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ent levels of precision, e.g. for a point defect in a bulk 
semiconductor, with higher accuracy close to the defect. 
Work is ongoing in this direction. 
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